Hi all. I have a dataset with information on candidates who ran in the 2018 general election in Brazil. Each candidate has a tax ID number called “CPF”. The CPF has 11 digits, and they all follow this format:
All CPF numbers have two dots, plus one hyphen separating the last two digits, as shown above.
I have another dataset that contains information for some of the candidates in the first dataset. It also contains CPF numbers. However, these CPFs have some digits masked with asterisks (for data confidentiality purposes):
E.g.: ***.438.750-**
What I have in my dataset: a string variable nr_cpf_candidato holding the eleven digits, without the dots and the hyphen.
What I need: to have all these CPFs reformatted to follow the data confidentiality format (as above), which will allow me to merge data from both datasets using the reformatted CPF.
For instance: 12345678910 in the original dataset needs to be converted to ***.456.789-**
Question: How can I do that?
Part of the data follows below. Thank you.
* Example generated by -dataex-. For more info, type help dataex
clear
input str61 nm_candidato str11 nr_cpf_candidato
"AMANDA BARBOSA DA SILVA" "10153045400"
"CARLOS ROBERTO DE ALMEIDA" "53154665749"
"ROBERTO CAUNETO PICORELI" "02128498910"
"DENISE RODRIGUES MATOS" "03861833760"
"JANIER MOTA SANTOS PRIMO" "51587874504"
"IZAC GONÇALVES DOS SANTOS" "06950326661"
"ALCIONY REGIA SOARES SANTOS" "96699000420"
"ELZA LUIZ DE QUEIROZ" "44615361653"
"FLÁVIO DA SILVA DAMIANI" "69136955949"
"FABIANO VERGINE TEIXEIRA DE SIQUEIRA" "22013906811"
"FRANCISCA GEANY FELIPE DO NASCIMENTO" "82880891353"
"ÁLVARO DAMIÃO VIEIRA DA PAZ" "67336361668"
"OSVALDO LUIS FERREIRA DE SOUZA" "83069348491"
"IVETE DA SILVA" "71279563400"
"DENIS DUCK" "02920126830"
"CLAUDIO FERREIRA SILVA" "13471410813"
"MANOEL LEOCADIO DE MENEZES" "31471382249"
"BRUNO ALBUQUERQUE TOLEDO" "01091273405"
"JOCIANA MARIA DE SOUSA" "84000058304"
"EVANDRO NEVIO ARGENTON" "49384104949"
"LUCIMAURO ANTONIO ALVES OLIVEIRA" "62118706472"
"FABIO LISANDRO DE LIMA BARROS" "48226645468"
"VERA BISPO DOS SANTOS" "91083451120"
"CLAUDINETE SENA CONCEIÇÃO" "58326880906"
"SEBASTIÃO DA COSTA CANDIDO" "03539167730"
end
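To make the target concrete, here is a minimal sketch of the conversion described above, written with Stata's substr() string function. It assumes every value of nr_cpf_candidato is exactly 11 characters long (leading zeros included), and the new variable name cpf_masked is only a placeholder chosen for illustration. It is one possible approach, not necessarily the best one.

```stata
* Sketch only: assumes nr_cpf_candidato is a string of exactly 11 digits
clear
input str11 nr_cpf_candidato
"12345678910"
"10153045400"
end

* Stop early if any value does not have the expected length
assert strlen(nr_cpf_candidato) == 11

* Keep digits 4-6 and 7-9, mask the first three and the last two
* (cpf_masked is a hypothetical name for the merge key)
generate str14 cpf_masked = "***." + substr(nr_cpf_candidato, 4, 3) ///
    + "." + substr(nr_cpf_candidato, 7, 3) + "-**"

list nr_cpf_candidato cpf_masked
```

Because only six of the eleven digits remain visible, two different candidates could end up with the same masked CPF, so it may be worth running duplicates report cpf_masked in both datasets before attempting the merge.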